(54) TGF-beta induced gene family.

(57) A new gene family induced by TGF-beta is disclosed. Two new genes, designated βIG-M1
and βIG-M2, are induced in response to TGF-β1 treatment of mouse embryo fibroblasts.
These genes encode proteins containing about 345 to about 380 amino acid residues,
with a molecular weight of about 37,000 to about 48,000 daltons and about 38 cysteine
residues. The induced proteins share about 50% homology with each other and significant
homology with a v-src induced protein in chicken embryo fibroblasts designated CEF-10.
These proteins may be involved in producing some of the growth and differentiation
modulating effects of TGF-β1.
TECHNICAL FIELD OF THE INVENTION

The present invention is directed to the induction of a new gene family in response to TGF-beta
administration to target cells in culture. Two specifically induced genes were isolated and characterized.

BACKGROUND OF THE INVENTION 

Transforming growth factor-β1 (TGF-β1) is a multifunctional regulator of cell growth and differentiation. It
is capable of causing diverse effects such as inhibition of the growth of monkey kidney cells (Tucker, R.F.,
G.D. Shipley, H.L. Moses & R.W. Holley (1984) Science 226:705-707), inhibition of growth of several human
cancer cell lines (Roberts, A.B., M.A. Anzano, L.M. Wakefield, N.S. Roche, D.F. Stern & M.B. Sporn (1985)
Proc. Natl. Acad. Sci. USA 82:119-123; Ranchalis, J.E., L.E. Gentry, Y. Ogawa, S.M. Seyedin, J. McPherson,
A. Purchio & D.R. Twardzik (1987) Biochem. Biophys. Res. Commun. 148:783-789), inhibition of mouse
keratinocytes (Coffey, R.J., N.J. Sipes, C.C. Bascom, R. Graves-Deal, C. Pennington, B.E. Weissman & H.L. Moses
(1988) Cancer Res. 48:1596-1602; Reiss, M. & C.L. Dibble (1988) In Vitro Cell. Dev. Biol. 24:537-544),
stimulation of growth of AKR-2B fibroblasts (Tucker, R.F., M.E. Volkenant, E.L. Branum & H.L. Moses (1988) Cancer
Res. 43:1581-1586) and normal rat kidney fibroblasts (Roberts, A.B., M.A. Anzano, L.C. Lamb, J.M. Smith &
M.B. Sporn (1981) Proc. Natl. Acad. Sci. USA 78:5339-5343), stimulation of synthesis and secretion of
fibronectin and collagen (Ignotz, R.A. & J. Massague (1986) J. Biol. Chem. 261:4337-4345; Centrella, M.,
T.L. McCarthy & E. Canalis (1987) J. Biol. Chem. 262:2869-2874), induction of cartilage-specific macromolecule
production in muscle mesenchymal cells (Seyedin, S.M., A.Y. Thompson, H. Bentz, D.M. Rosen, J. McPherson,
A. Contin, N.R. Siegel, G.R. Galluppi & K.A. Piez (1986) J. Biol. Chem. 261:5693-5695) and growth
inhibition of T and B lymphocytes (Kehrl, J.H., L.M. Wakefield, A.B. Roberts, S. Jakowlew, M. Alvarez-Mon, R.
Derynck, M.B. Sporn & A.S. Fauci (1986) J. Exp. Med. 163:1037-1050; Kehrl, J.H., A.B. Roberts, L.M.
Wakefield, S. Jakowlew, M.B. Sporn & A.S. Fauci (1987) J. Immunol. 137:3855-3860; Kasid, A., G.I. Bell & E.P.
Director (1988) J. Immunol. 141:690-698; Wahl, S.M., D.A. Hunt, H.L. Wong, S. Dougherty, N. McCartney-Francis,
L.M. Wahl, L. Ellingsworth, J.A. Schmidt, G. Hall, A.B. Roberts & M.B. Sporn (1988) J. Immunol.
140:3026-3032).

Recent investigations have indicated that TGF-β1 is a member of a family of closely related growth-modulating
proteins including TGF-β2 (Seyedin, S.M., P.R. Segarini, D.M. Rosen, A.Y. Thompson, H. Bentz & J.
Graycar (1987) J. Biol. Chem. 262:1946-1949; Cheifetz, S., J.A. Weatherbee, M.L-S. Tsang, J.K. Anderson,
J.E. Mole, R. Lucas & J. Massague (1987) Cell 48:409-415; Ikeda, T., M.N. Lioubin & H. Marquardt (1987)
Biochemistry 26:2406-2410), TGF-β3 (TenDijke, P., P. Hansen, K. Iwata, C. Pieler & J.G. Foulkes (1988) Proc.
Natl. Acad. Sci. USA 85:4715-4719; Derynck, R., P. Lindquist, A. Lee, D. Wen, J. Tamm, J.L. Graycar, L. Rhee,
A.J. Mason, D.A. Miller, R.J. Coffey, H.L. Moses & E.Y. Chen (1988) EMBO J. 7:3737-3743; Jakowlew, S.B.,
P.J. Dillard, P. Kondaiah, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:747-755), TGF-β4
(Jakowlew, S.B., P.J. Dillard, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:1186-1195), Mullerian
inhibitory substance (Cate, R.L., R.J. Mattaliano, C. Hession, R. Tizard, N.M. Faber, A. Cheung, E.G. Ninfa,
A.Z. Frey, D.J. Dash, E.P. Chow, R.A. Fisher, J.M. Bertonis, G. Torres, B.P. Wallner, K.L. Ramachandran,
R.C. Ragin, T.F. Manganaro, D.T. Maclaughlin & P.K. Donahoe (1986) Cell 45:685-698) and the inhibins
(Mason, A.J., J.S. Hayflick, N. Ling, F. Esch, N. Ueno, S.-Y. Ying, R. Guillemin, H. Niall & P.H. Seeburg
(1985) Nature 318:659-663).

TGF-β1 is a 24-kD protein consisting of two identical disulfide-bonded 12-kD subunits (Assoian, R.K., A.
Komoriya, C.A. Meyers, D.M. Miller & M.B. Sporn (1983) J. Biol. Chem. 258:7155-7160; Frolik, C.A., L.L. Dart,
C.A. Meyers, D.M. Miller & M.B. Sporn (1983) Proc. Natl. Acad. Sci. USA 80:3676-3680; Frolik, C.A., L.M.
Wakefield, D.M. Smith & M.B. Sporn (1984) J. Biol. Chem. 259:10995-11000). Analysis of cDNA clones coding
for human (Derynck, R., J.A. Jarrett, E.Y. Chen, D.H. Eaton, J.R. Bell, R.K. Assoian, A.B. Roberts, M.B. Sporn
& D.V. Goeddel (1985) Nature 316:701-705), murine (Derynck, R., J.A. Jarrett, E.Y. Chen & D.V. Goeddel
(1986) J. Biol. Chem. 261:4377-4379) and simian (Sharples, K., G.D. Plowman, T.M. Rose, D.R. Twardzik &
A.F. Purchio (1987) DNA 6:239-244) TGF-β1 indicates that this protein is synthesized as a larger 390 amino
acid pre-pro-TGF-β1 precursor; the carboxyl terminal 112 amino acid portion is then proteolytically cleaved to
yield the TGF-β1 monomer.
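
The precursor arithmetic above can be made concrete with a short sketch. Only the two lengths (390 and 112 residues) come from the text; the function name and the 1-based, inclusive coordinate convention are assumptions chosen for illustration.

```python
from typing import Tuple


def c_terminal_region(precursor_len: int, mature_len: int) -> Tuple[int, int]:
    """Return the 1-based inclusive (start, end) of a C-terminal mature region.

    Assumes the mature peptide is exactly the last `mature_len` residues
    of the precursor, as described for pre-pro-TGF-beta1.
    """
    if not 0 < mature_len <= precursor_len:
        raise ValueError("mature region must fit inside the precursor")
    start = precursor_len - mature_len + 1
    return start, precursor_len


# Lengths stated in the text: 390-residue precursor, 112-residue monomer.
start, end = c_terminal_region(390, 112)
print(f"mature region: residues {start}-{end}")
```

Under these assumptions the mature monomer corresponds to residues 279 through 390 of the precursor.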

The simian TGF-β1 cDNA clone has been expressed to high levels in Chinese hamster ovary (CHO) cells.
Analysis of the proteins secreted by these cells using site-specific antipeptide antibodies, peptide mapping, and
protein sequencing revealed that both mature and precursor forms of TGF-β were produced and were held
together, in part, by a complex array of disulfide bonds (Gentry, L.E., N.R. Webb, J. Lim, A.M. Brunner, J.E.
Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell Biol. 7:3418-3427;
Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168). Upon purification
away from the 24 kD mature rTGF-β1, the 90 to 110 kD precursor complex was found to consist of three species:
pro-TGF-β1, the pro-region of the TGF-β1 precursor, and mature TGF-β1 (Gentry, L.E., N.R. Webb, J. Lim,
A.M. Brunner, J.E. Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell Biol.
7:3418-3427; Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168).
Detection of optimal biological activity required acidification before analysis, indicating that rTGF-β1 was
secreted in a latent form.

The pro-region of the TGF-β1 precursor was found to be glycosylated at three sites (Asn 82, Asn 136, and
Asn 176) and the first two of these (Asn 82 and Asn 136) contain mannose-6-phosphate residues (Brunner,
A.M., L.E. Gentry, J.A. Cooper & A.F. Purchio (1988) Mol. Cell Biol. 8:2229-2232; Purchio, A.F., J.A. Cooper,
A.M. Brunner, M.N. Lioubin, L.E. Gentry, K.S. Kovacina, R.A. Roth & H. Marquardt (1988) J. Biol. Chem.
263:14211-14215). In addition, the rTGF-β1 precursor is capable of binding to the mannose-6-phosphate
receptor, which may imply a mechanism for delivery to lysosomes where proteolytic processing can occur
(Kornfeld, S. (1986) J. Clin. Invest. 77:1-6).

TGF-β2 is also a 24-kD homodimer of identical disulfide-bonded 112 amino acid subunits (Marquardt, H.,
M.N. Lioubin & T. Ikeda (1987) J. Biol. Chem. 262:12127-12131). Analysis of cDNA clones coding for human
(Madisen, L., N.R. Webb, T.M. Rose, H. Marquardt, T. Ikeda, D. Twardzik, S. Seyedin & A.F. Purchio (1988)
DNA 7:1-8; DeMartin, R., B. Haendler, R. Hoefer-Warbinek, H. Gaugitsch, M. Wrann, H. Schlusener, J.M.
Seifert, S. Bodmer, A. Fontana & E. Hoefer. EMBO J. 6:3673-3677) and simian (Hanks, S.K., R. Armour, J.H.
Baldwin, F. Maldonado, J. Spiess & R.W. Holley (1988) Proc. Natl. Acad. Sci. USA 85:79-82) TGF-β2 showed
that it, too, is synthesized as a larger precursor protein. The mature regions of TGF-β1 and TGF-β2 show 70%
homology, whereas 30% homology occurs in the pro-region of the precursor. The simian and human
TGF-β2 precursor proteins differ by a 28 amino acid insertion in the pro-region; the mRNAs coding for these two
proteins are thought to arise via differential splicing (Webb, N.R., L. Madisen, T.M. Rose & A.F. Purchio (1988)
DNA 7:493-497).


SUMMARY OF THE INVENTION 

The present invention is directed to the induction in mammalian cells of a new family of genes in response
to TGF-beta administration. The induced genes encode a class of similar proteins containing about 345 to about
380 amino acid residues, having a molecular weight of about 37,000 daltons to about 45,000 daltons and
containing about 38 cysteine residues. The cysteine residues are substantially conserved and these proteins share
about 50% homology with each other. The induced gene products further share extensive homology with a
protein induced by v-src in chicken embryo fibroblasts.

The present invention specifically discloses the induction by TGF-beta in mouse embryo cells of a gene
family encoding proteins designated as βIG-M1 and βIG-M2 (beta-induced gene-mouse 1 and 2, respectively)
that share about 80% and 50% homology, respectively, with the CEF-10 protein induced by v-src in chicken
embryo fibroblasts. The nucleotide sequences for βIG-M1 and βIG-M2 were elucidated and compared. The
induction of the genes of the present invention by TGF-beta had not been previously reported or envisioned.

DESCRIPTION OF THE FIGURES

In the drawings: 

FIGURE 1 illustrates the nucleotide and deduced amino acid sequences of βIG-M1, and corresponds to
Sequence I.D. No. 1.

FIGURE 2 illustrates the nucleotide and deduced amino acid sequences of βIG-M2, and corresponds to
Sequence I.D. No. 3.

FIGURE 3 illustrates Northern blot analysis of βIG-M1 and βIG-M2 RNA. Total RNA was extracted from
AKR-2B cells (Purchio and Fareed (1979) J. Virol. 29:763-769), fractionated on a 1% agarose-formaldehyde
gel (Lehrach et al. (1977) Biochemistry 16:4743-4751) and hybridized to [32P]-labelled βIG-M1 (A) or βIG-M2
(C) probes. Lane 1, AKR-2B; Lane 2, AKR-2B and TGF-β1; Lane 3, AKR-2B and cycloheximide; Lane 4,
AKR-2B and cycloheximide and TGF-β1. The gels shown in panels A and C were stained with methylene blue and
photographed (B and D) to show equal loading of RNAs.

FIGURE 4 illustrates the alignment of amino acid residue sequences for βIG-M1 and CEF-10 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 5 illustrates the alignment of amino acid residue sequences for βIG-M2 and CEF-10 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 6 illustrates the alignment of amino acid residue sequences for βIG-M2 and βIG-M1 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 7 illustrates the multiple sequence alignment of region II of CS protein. The alignment shown is
between 8 protein sequences. An asterisk (*) indicates the positions where alignment is perfectly conserved,
and a dot (.) indicates those positions that are well conserved.
The aligned regions represented are:
. βIG-M1: amino acid residues 227-286 (60 residues)
. CEF12CS (CEF10): amino acid residues 224-283 (60 residues)
. βIG-M2: amino acid residues 198-257 (60 residues)
. PFALCIPACS (P. falciparum CS protein region II): amino acid residues 340-395 (55 residues)
. PROPERDCSR (Properdin): consensus of 6 repeats (60 residues)
. THROMBOCS (Thrombospondin): repeat region, amino acid residues 420-476 (56 residues)
. PFALTRAPCS (P. falciparum TRAP): amino acid residues 244-291 (48 residues)
. C7COMPCS (C7 terminal complement motif): amino acid residues 8-63 (56 residues)

FIGURE 8 illustrates a Southern blot analysis of mouse genomic DNA with pβIG-M2. High molecular weight
DNA was extracted from mouse kidneys, digested with Bam HI (lane 1), Eco RI (lane 2), Hind III (lane 3) or
Sst I (lane 4) and analyzed by Southern blotting with [32P]-labeled pβIG-M2 (panel A) or [32P]-labeled pβIG-M1
(panel B).

DESCRIPTION OF PREFERRED EMBODIMENTS 

The present invention is directed to the induction of a gene family by TGF-beta administration to target cells.
The genes encode a family of proteins having about 345 to about 380 amino acid residues, having a molecular
weight of about 37,000 daltons to about 45,000 daltons and containing about 38 cysteine residues.
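
As a rough plausibility check on these figures, the chain lengths can be converted to approximate masses. The sketch below assumes a rule-of-thumb average of 110 daltons per residue, which is not a value taken from this specification, and it ignores glycosylation and other modifications.

```python
# Assumption: ~110 Da is a commonly used average mass per amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_da(n_residues: int) -> float:
    """Estimate the unmodified polypeptide mass from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA


# Residue counts stated in the text for the induced protein family.
for n in (345, 380):
    print(f"{n} residues -> about {approx_mass_da(n):,.0f} Da")
```

With this approximation, both ends of the stated length range give estimates that fall inside the stated range of about 37,000 to about 45,000 daltons.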

TGF-β1 is known to regulate the transcription of several genes, such as the genes encoding c-myc, c-sis,
the receptor for platelet derived growth factor (PDGF) and TGF-beta1. The proteins encoded by the TGF-beta1
induced genes are likely involved in mediation of the biological effects of TGF-beta1 relating to cell growth and
differentiation.

All amino acid residues identified herein are in the natural L-configuration. In keeping with standard
polypeptide nomenclature, abbreviations for amino acid residues are as follows:






EP 0 495 674 A2 



AMINO ACID                      SYMBOL
                                3-Letter    1-Letter
Alanine                         Ala         A
Arginine                        Arg         R
Asparagine                      Asn         N
Aspartic acid                   Asp         D
Aspartic acid or Asparagine     Asx         B
Cysteine                        Cys         C
Glutamine                       Gln         Q
Glutamic acid                   Glu         E
Glycine                         Gly         G
Glutamic acid or Glutamine      Glx         Z
Histidine                       His         H
Isoleucine                      Ile         I
Leucine                         Leu         L
Lysine                          Lys         K
Methionine                      Met         M
Phenylalanine                   Phe         F
Proline                         Pro         P
Serine                          Ser         S
Threonine                       Thr         T
Tryptophan                      Trp         W
Tyrosine                        Tyr         Y
Valine                          Val         V


In the present invention it was found that when cells are treated with TGF-beta1, at least one new class of
genes was transcriptionally activated. This class of genes was established by isolating the RNA from the treated
cells, processing it, and then preparing cDNA from the RNA. The cDNA was further cloned and a library of genes
prepared.

As used herein, the term "library" refers to a large random collection of cloned DNA fragments obtained
from the transcription system of interest. The gene library was then screened with labelled cDNA probes
obtained from TGF-beta treated and untreated cells. This approach led to the detection of TGF-beta1 induced
genes.
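
The logic of this differential screen can be sketched as a simple comparison of the signal each plaque gives with the two probes. Everything in the sketch is an illustrative assumption: the function name, the ratio threshold, the background floor, and the plaque signal values are invented, and the specification does not describe a numerical cut-off.

```python
from typing import Dict, List, Tuple


def induced_clones(
    signals: Dict[str, Tuple[float, float]],
    min_ratio: float = 3.0,   # hypothetical cut-off for "strongly" induced
    floor: float = 1.0,       # hypothetical background, avoids division by zero
) -> List[str]:
    """Pick clones whose treated-probe signal clearly exceeds the untreated one.

    `signals` maps a clone name to (untreated_signal, treated_signal).
    """
    picked = []
    for clone, (untreated, treated) in signals.items():
        ratio = treated / max(untreated, floor)
        if ratio >= min_ratio:
            picked.append(clone)
    return picked


# Hypothetical plaque signals in arbitrary units, invented for illustration only.
example = {
    "plaque_a": (10.0, 12.0),  # similar with both probes: not induced
    "plaque_b": (2.0, 40.0),   # much stronger with the treated-cell probe
    "plaque_c": (0.0, 9.0),    # detected only with the treated-cell probe
}
print(induced_clones(example))
```

In practice such a choice was presumably made by eye from autoradiographs; the ratio here only formalizes the idea of a plaque that hybridizes more strongly with the probe from treated cells.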

In a preferred embodiment, mouse AKR-2B cells (obtained from Dr. H. Moses, Vanderbilt University,
Nashville, TN) were treated with TGF-beta1, and two new genes, designated βIG-M1 and βIG-M2,
respectively, were elucidated. The coding sequences for these genes were obtained by cDNA cloning of the
polyadenylated RNA isolated from the AKR-2B cells. The entire coding region was sequenced and then compared to
known published sequences. The deduced amino acid sequences of the βIG-M1 and βIG-M2 gene products
demonstrated about 80% and 50% homology, respectively, with CEF-10, a gene induced by v-src in chicken
embryo fibroblasts (Simmons et al. (1989) Proc. Natl. Acad. Sci. USA 86:1178). Comparison and alignment
of the amino acid sequences of CEF-10 with βIG-M1 and βIG-M2 are shown in FIGURES 4 and 5, respectively.
It is readily seen that significant homology exists between these proteins and that 38 of the 39 cysteine residues
are conserved. When βIG-M1 and βIG-M2 are compared with each other, approximately 50% homology is seen
between the two sequences (FIGURE 6).

Upon further investigation it was found that the C-terminal cysteine rich domain of CEF-10, βIG-M1, and
βIG-M2 contains an amino acid sequence motif with strong homology (9 of 12 amino acids) to a motif found near
the C-terminus of the malarial circumsporozoite (CS) protein (FIGURE 7). This region of the CS protein,
designated 'region II', is highly conserved (10 of 12 amino acids) among all species of malarial parasites sequenced
to date (Robson, K.J.H., et al. (1988) Nature 335:79; Rich, K.A., et al. (1990) Science 249:1574). The CS protein
is expressed on the surface of Plasmodium species during the sporozoite phase and may be involved in
recognition and entry into hepatocytes (Aley, S.B., et al. (1986) J. Exp. Med. 164:1915).

The role of the region II motif in cell adhesion has been demonstrated by using peptide fragments of P. vivax
CS protein to promote T-cell and myeloid cell line attachment to microtiter plates (Rich, K.A., et al. (1990)
Science 249:1574). Furthermore, only peptides overlapping region II were able to inhibit T-cell and myeloid cell
lines from binding to the CS protein.

The region II CS protein motif (CS motif) is also found in other proteins which may have cell adhesive
properties that mediate cell-cell and cell-extracellular matrix interactions, such as properdin, thrombospondin,
thrombospondin related anonymous protein (TRAP) and various complement components.

Properdin has 6 repeats containing the CS motif. Properdin is involved in stabilizing the 'alternate' pathway
of complement which involves the binding of C3b to the surfaces of foreign organisms (Goundis, D. and Reid,
K.B.M. (1988) Nature 335:82).

Thrombospondin has 3 repeats of the CS motif. Data suggest it is a member of a class of adhesive proteins
secreted by activated platelets and tissue culture cells, associating with the platelet membrane and becoming
incorporated in fibrin clots and extracellular matrix (Lawler, J. and Hynes, R.O. (1986) J. Cell Biol. 103:1635).

TRAP is a surface antigen expressed during the blood stage of P. falciparum and may be involved in
attachment to erythrocytes (possibly via C3b) prior to invasion (Robson, K.J.H., et al. (1988) Nature 335:79).

A comparison of the amino acid residue sequences of these proteins is shown in FIGURE 7, and
demonstrates a high degree of conservation of the region II sequence.

The N-terminus and the C-terminus of complement components C7, C8α, and C8β, and the N-terminus of
C9 contain motifs that have weak homology to the CS motif (Goundis, D. and Reid, K.B.M. (1988) Nature
335:82).

Libraries of cDNA were generated in the present invention as a means to detect the induction of new genes
by TGF-beta1. Double stranded cDNA containing EcoRI cohesive termini was ligated into the unique EcoRI
cloning site present in λgt10 DNA. The recombinant DNA was then packaged into viable phage particles and plated
on appropriate hosts (E. coli strain C600 rK- mK+ hfl) for amplification and screening.

λgt10 is an insertion vector with a cloning capacity of up to 7 kb. The unique EcoRI cloning site is located
in the λ repressor (cI) gene. Insertion of foreign DNA at this restriction site interrupts the cI coding sequence
and causes the phenotype of the phage to change from cI+ (wild type) to cI-. Since cI- phage are unable to
lysogenize the host, clear plaques are produced by the recombinants. When plated on mutant bacteria which
produce lysogeny, or bacteriophage integration, at a high frequency, only recombinant cI- phage produce
plaques. Nonrecombinants, such as λgt10 without an insert, are effectively suppressed from plaque formation.
This has served in the present invention as the basis for the biological selection for recombinant phage during
λgt10 library amplification.

Selection of the cloned sequences of interest in the present invention was carried out by screening the
library with nucleic acid sequences derived from TGF-β1 treated and untreated cells. This screening is dependent
upon molecular hybridization by annealing of single-stranded nucleic acid molecules to form duplex structures
that are stabilized by sequence-specific hydrogen bonds. Only nucleic acids of related sequence organization
will base pair, or hybridize, with each other.
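
The base-pairing rule behind hybridization can be illustrated with a minimal sketch. It treats hybridization as perfect Watson-Crick complementarity between a single-stranded probe and a target strand, which is a deliberate simplification: real hybridization tolerates some mismatch depending on stringency. The function names are hypothetical, and the ten-base target is the start of SEQ ID NO:1 as printed in the sequence listing.

```python
# Watson-Crick pairing for DNA: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that would anneal antiparallel to `seq`."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_hybridize(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to a stretch of the target.

    Simplifying assumption: no mismatches are allowed.
    """
    return reverse_complement(probe) in target.upper()


target = "GACCGTGAGC"               # first ten bases of SEQ ID NO:1 as printed
probe = reverse_complement(target)  # a probe designed to anneal to the target
print(probe, can_hybridize(probe, target))
print(can_hybridize("AAAAAAAAAA", target))
```

An unrelated probe finds no complementary stretch and therefore does not pair, which is the property the differential screen relies on.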

Northern blot analysis as carried out in the present invention allows the detection of rare RNA molecules
in a cell. In this technique, total cellular RNA is prepared and then resolved into different size classes
electrophoretically. The resolved RNA is then transferred and probed with radiolabeled DNA, followed by
radioautographic detection of DNA-RNA hybrid duplexes.

The Northern blot technology was used in the present invention to further characterize βIG-M1 and βIG-M2.

The present invention is further described by the following Examples which are intended to be illustrative 
and not limiting. 


EXAMPLE 1 

Isolation of βIG-M1 and βIG-M2

AKR-2B mouse cells (obtained from Dr. H. Moses, Vanderbilt University, Nashville, TN) were grown to
confluency in McCoy's media (GIBCO BRL, Gaithersburg, MD) plus 5% fetal bovine serum (FBS). The cells
were then treated with cycloheximide (10 µg/ml) for 15 minutes.

TGF-beta1 (10 ng/ml) was then added to the cells and the cells maintained for 6 hours at about 37°C with
cycloheximide and TGF-beta1.

The RNA was extracted from the cells. Polyadenylated RNA (polyA-RNA) was isolated by passage of the
extracted RNA through an oligo-dT cellulose column. The polyA-RNA was then used to prepare cDNA by use
of reverse transcriptase. The cDNA was cloned into λgt10 phage by using an EcoRI bridger according to the
method of Webb, N.R. et al., 1987, DNA 6:71-79.

A DNA library was prepared and was then screened using two 32P-labelled cDNA probes. The 32P-labelled
cDNA probes were prepared, respectively, from untreated AKR-2B mRNA and AKR-2B mRNA from cells
treated with cycloheximide and TGF-beta1. The probes were hybridized with the DNA library to identify
positive plaques. Those plaques that had hybridized strongly with the probe from treated cells were isolated
and further purified. The DNA from the tertiary plaques was cut with EcoRI and then cloned into plasmid
pEMBL18. Two clones (βIG-M1 and βIG-M2) were then sequenced. The sequences are shown in FIGURES 1
and 2 (Sequence I.D. Nos. 1 and 3, respectively).

Northern blot analysis of the mRNA from treated and untreated cells is shown in FIGURE 3. βIG-M1
(FIGURE 3A, lane 2) and βIG-M2 (FIGURE 3C, lane 2) RNAs were significantly increased in AKR-2B cells after a 6
hour treatment with TGF-β1. These RNAs were barely detectable in untreated cells (FIGURES 3A and 3C, lane
1). Both βIG-M1 and βIG-M2 RNAs were increased by treatment with cycloheximide alone (FIGURES 3A and
3C, lane 3) and were even further induced by treatment with the combination of cycloheximide and TGF-β1
(FIGURES 3A and 3C, lane 4). TGF-β1 treatment in the presence of cycloheximide increased βIG-M2 RNA
to a much higher extent (15 fold) than βIG-M1 RNA (3 fold) over those values observed after cycloheximide
treatment alone.

Southern blot analysis was carried out using mouse kidney DNA and clearly demonstrated that the two
probes hybridized to different restriction fragments (FIGURE 8A and B), indicating that βIG-M1 and βIG-M2 are
encoded by different genes. It is readily seen that the administration of TGF-β1 in the presence of
cycloheximide significantly induces the production of mRNA for both βIG-M1 and βIG-M2 (FIGURE 3). A small
amount of constitutive synthesis of these mRNAs is seen in the cycloheximide treated cells.

EXAMPLE 2 

Characterization of βIG-M1 and βIG-M2


The amino acid residue sequences for βIG-M1 and βIG-M2 (Sequence I.D. Nos. 2 and 4, respectively) were
determined and compared. As shown in FIGURE 6, when the two protein sequences are aligned there is a 47.7%
homology between the sequences with conservation of 38 of the 39 cysteine residues.
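
A percent identity figure of this kind can be computed from a pairwise alignment as sketched below. The specification does not state how its 47.7% value was calculated, so the choice of denominator (aligned columns in which neither sequence has a gap) is an assumption, as is the function name. The two eight-residue inputs are the Gly-Cys-Gly-Cys-Cys-X-X-Cys motif instances reported for βIG-M1 and βIG-M2 in this Example, written in one-letter code; their identity is unrelated to the full-length value.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned, gap-free columns.

    `a` and `b` must already be aligned and therefore of equal length.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


# Motif instances from the text: βIG-M1 (GCGCCKVC) and βIG-M2 (GCGCCRVC).
print(percent_identity("GCGCCKVC", "GCGCCRVC"))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length including gaps, which gives somewhat different percentages for the same alignment.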

Comparison of the protein sequence with the v-src-induced gene product CEF-10 (Sequence I.D. No. 6)
shows homology of about 80% with βIG-M1 (Sequence I.D. No. 2) as seen in FIGURE 4, and of about 50%
with βIG-M2 (Sequence I.D. No. 4) as seen in FIGURE 5.

DNA sequence analysis of pβIG-M1 indicated that it contained a single open reading frame coding for a
379 amino acid polypeptide. As stated above, this protein is about 80% homologous to CEF-10. It was further
determined that βIG-M1 protein is identical to the protein encoded by cyr61, as described in O'Brien et al. (1990)
Mol. Cell Biol. 10:3569-3577, an immediate early response gene induced in quiescent BALB 3T3 cells by serum
treatment.
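
The 379-residue length can be cross-checked against the coding region location given for SEQ ID NO:1 in the sequence listing (186..1322). The sketch below does that arithmetic and translates the first six codons as printed in the listing using the standard genetic code. The helper names are hypothetical, and the coordinates are assumed to be 1-based and inclusive with the stop codon lying outside the stated range.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(start: int, end: int) -> int:
    """Number of codons in a 1-based inclusive coding range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3


def translate(dna: str) -> str:
    """Translate DNA to one-letter protein code, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(codon_count(186, 1322))           # CDS location from the sequence listing
print(translate("ATGAGCTCCAGCACCTTC"))  # first six codons as printed
```

The range 186..1322 spans 1137 bases, or 379 codons, which agrees with the stated polypeptide length.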

DNA sequence analysis of pβIG-M2 (FIGURE 2) indicates a single open reading frame encoding a 348
amino acid protein. The amino terminal portion of βIG-M2 contains a hydrophobic stretch which could function
as a signal peptide. Beginning at amino acid residue 52 in FIGURE 2, βIG-M2 contains the sequence
Gly-Cys-Gly-Cys-Cys-Arg-Val-Cys which conforms to the Gly-Cys-Gly-Cys-Cys-X-X-Cys motif reported in the amino
half of insulin-like growth factor (IGF) binding proteins (Binkert et al. (1988) EMBO J. 8:2497-2502; Albiston
et al. (1990) Biochem. Biophys. Res. Commun. 16:892-897; Brinkman et al. (1988) EMBO J. 7:2417-2423).
This motif is also present in βIG-M1 at amino acid residues 49-56 in FIGURE 1.
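
The motif search described above can be reproduced with a regular expression. The sketch below scans the first 64 residues of βIG-M1, transcribed into one-letter code from Sequence I.D. No. 2, for the Gly-Cys-Gly-Cys-Cys-X-X-Cys pattern. The function name and the 1-based inclusive coordinates it returns are choices made for illustration.

```python
import re
from typing import List, Tuple

# Gly-Cys-Gly-Cys-Cys-X-X-Cys, where "." stands for any residue (X).
MOTIF = re.compile(r"GCGCC..C")


def find_motif(protein: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, matched_text) for each hit, 1-based inclusive."""
    return [(m.start() + 1, m.end(), m.group()) for m in MOTIF.finditer(protein)]


# Residues 1-64 of βIG-M1 from Sequence I.D. No. 2, in one-letter code.
BIG_M1_N_TERM = (
    "MSSSTFRTLAVAVTLL"   # residues 1-16
    "HLTRLALSTCPAACHC"   # residues 17-32
    "PLEAPKCAPGVGLVRD"   # residues 33-48
    "GCGCCKVCAKQLNEDC"   # residues 49-64
)
print(find_motif(BIG_M1_N_TERM))
```

The single hit spans residues 49 to 56, matching the position stated in the text.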

The foregoing description and Examples are intended as illustrative of the present invention, but not as
limiting. Numerous variations and modifications may be effected without departing from the true spirit and scope
of the present invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: BRISTOL-MYERS SQUIBB COMPANY 
345 Park Avenue 
New York, New York 10154 
United States of America 

(ii) TITLE OF INVENTION: TGF-BETA INDUCED GENE FAMILY 



(iii) NUMBER OF SEQUENCES: 6 


(iv) CORRESPONDENCE ADDRESS : 

(A) ADDRESSEE: Joseph M. Sorrentino 

(B) STREET: 3005 First Avenue

(C) CITY: Seattle 

(D) STATE: Washington

(E) COUNTRY: USA 

(F) ZIP: 98121 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.24

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US unassigned 

(B) FILING DATE: 18-JAN-1991 

(C) CLASSIFICATION: 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Sorrentino, Joseph M.

(B) REGISTRATION NUMBER: 32,598 

(C) REFERENCE /DOCKET NUMBER: ON0081- 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206)728-4800 

(B) TELEFAX: (206)448-4775 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2028 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mus musculus 

(G) CELL TYPE: Fibroblast 

(H) CELL LINE: AKR2B 

(viii) POSITION IN GENOME: 

(C) UNITS: bp 

(ix) FEATURE: 

(A) NAME/ KEY: CDS 

(B) LOCATION: 186.. 1322 

(D) OTHER INFORMATION: 
(ix) FEATURE: 

(A) NAME/ KEY: mat_peptide

(B) LOCATION: 186.. 1322 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GACCGTGAGC GAGAGGCCCA GAGAAGCGCC TGCAATCTCT GCGCTCCTCC GCCAGCACCT 60 

CGAGAGAAGG ACACCCGCCG CCTCGGCCCT CGCCTCACCG CACTCCGGGC GCATTTGATC 120 

CCGCTGCTCG CCGGCTTGTT GGTTCTGTGT CGCCGCGCTC GCCCCGGTTC CTCCTGCGCG 180 

CCACA ATG AGC TCC AGC ACC TTC AGC ACG CTC GCT GTC GCC GTC ACC 227
Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr
1 5 10

CTT CTC CAC TTG ACC AGA CTG GCG CTC TCC ACC TCC CCC GCC GCC TGC 275
Leu Leu His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala Cys
15 20 25 30

CAC TGC CCT CTG GAG GCA CCC AAG TGC GCC CCG GGA GTC GGG TTG GTC 323
His Cys Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly Leu Val
35 40 45

CGG GAC GGC TGC GGC TGC TGT AAG GTC TGC GCT AAA CAA CTC AAC GAG 371
Arg Asp Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu
50 55 60

GAC TGC AGC AAA ACT CAG CCC TGC GAC CAC ACC AAG GGG TTG GAA TGC 419
Asp Cys Ser Lys Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys
65 70 75

AAT TTC GGC GCC AGC TCC ACC GCT CTG AAA GGG ATC TGC AGA GCT CAG 467
Asn Phe Gly Ala Ser Ser Thr Ala Leu Lys Gly Ile Cys Arg Ala Gln
80 85 90

TCA GAA GGC AGA CCC TGT GAA TAT AAC TCC AGA ATC TAC CAA AAC GGG 515
Ser Glu Gly Arg Pro Cys Glu Tyr Asn Ser Arg Ile Tyr Gln Asn Gly
95 100 105 110

GAA AGC TTC CAG CCC AAC TGT AAA CAC CAG TGC ACA TGT ATT GAT GGC 563
Glu Ser Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly
115 120 125

GCC GTG GGC TGC ATT CCT CTG TGT CCC CAA GAA CTG TCT CTC CCC AAT 611
Ala Val Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn
130 135 140

CTG GGC TGT CCC AAC CCC CGG CTG GTG AAA GTC AGC GCG CAG TGC TGT 659
Leu Gly Cys Pro Asn Pro Arg Leu Val Lys Val Ser Gly Gln Cys Cys
145 150 155

GAA GAG TGG GTT TGT GAT GAA GAC AGC ATT AAG GAC TCC CTG GAC GAC 707
Glu Glu Trp Val Cys Asp Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp
160 165 170



CAG GAT GAC CTC CTC GGA CTC GAT GCC TCG GAG GTG GAG TTA ACG AGA 755
Gln Asp Asp Leu Leu Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg
175 180 185 190

AAC AAT GAG TTA ATC GCA ATT GGA AAA GGC AGC TCA CTG AAG AGG CTT 803
Asn Asn Glu Leu Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu
195 200 205

CCT GTC TTT GGC ACC GAA CCG CGA GTT CTT TTC AAC CCT CTG CAC GCC 851
Pro Val Phe Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala
210 215 220

CAT GGC CAG AAA TGC ATC GTT CAG ACC ACG TCT TGG TCC CAG TGC TCC 899
His Gly Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser
225 230 235

AAG AGC TGC GGA ACT GGC ATC TCC ACA CGA GTT ACC AAT GAC AAC CCA 947
Lys Ser Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro
240 245 250

GAG TGC CGC CTG GTG AAA GAG ACC CGG ATC TGT GAA GTG CGT CCT TGT 995
Glu Cys Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys
255 260 265 270

GGA CAA CCA GTG TAC AGC AGC CTA AAA AAG GGC AAG AAA TGC AGC AAG 1043
Gly Gln Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys Ser Lys
275 280 285

ACC AAG AAA TCC CCA GAA CCA GTC AGA TTT ACT TAT GCA GGA TGC TCC 1091
Thr Lys Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser
290 295 300

AGT GTC AAG AAA TAC CGG CCC AAA TAC TGC GGC TCC TGC GTA GAT GGC 1139
Ser Val Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly
305 310 315

CGG TGC TGC ACA CCT CTG CAG ACC AGA ACT GTG AAG ATG CGG TTC CGA 1187
Arg Cys Cys Thr Pro Leu Gln Thr Arg Thr Val Lys Met Arg Phe Arg
320 325 330

TGC GAA GAT GGA GAG ATG TTT TCC AAG AAT GTC ATG ATG ATC CAG TCC 1235
Cys Glu Asp Gly Glu Met Phe Ser Lys Asn Val Met Met Ile Gln Ser
335 340 345 350

TGC AAA TGT AAC TAC AAC TGC CCG CAT CCC AAC GAG GCA TCG TTC CGA 1283
Cys Lys Cys Asn Tyr Asn Cys Pro His Pro Asn Glu Ala Ser Phe Arg
355 360 365

CTG TAC AGC CTA TTC AAT GAC ATC CAC AAG TTC AGG GAC TAAGTGCCTC 1332
Leu Tyr Ser Leu Phe Asn Asp Ile His Lys Phe Arg Asp
370 375






CAGGGTTCCT AGTGTGGGCT GGACAGAGGA GAAGCGCAAG CATCATGGAG ACGTGGGTGG 1392 

GCGGAGCATG AATGGTGCCT TGCTCATTCT TGAGTAGCAT TAGGGTATTT CAAAACTGCC 1452 

AAGGGGCTGA TGTGGACGGA CAGCAGCGCA GCCGCAGTTG GAGAATGCCA AGGGGCTGAT 1512 

GTGGACGGAC AGCAGCGCAG CCGCAGTTGG AGAAGACTTC GCTTCATAGT ACTGGAGCGG 1572 

GCATTATTGC TCCATATTGG AGCATGTTTA CGGATGACGT TCTGTTTTCT GTTTGTAAAT 1632 

TATTTGCTAA GTGTATTTTT TTGCTCCAGA CCCCCCCCCC CCCTTTCTTG GTTCTACAAT 1692 

TGTAATAGAG ACAAAATAAG ATTAGTTGGG CCAAGTGAAA GCCCTGCTTG TCCTTTGACA 1752 

GAAGTAAATG AAAGCGCCTC TCATTCCTTC CCGAGCGGAG GGGGGACACT CTGTGAGTGT 1812 

CCTTGGGGCA GCTACCTGCA CTCTAAAACT GCAAACAGAA ACCAGGTGTT TTAAGATTGA 1872 

ATGTTTTTTT ATTTATCAAA GTGTAGCTTT TGGGGAGGGA GGGGAAATGT AATACTGGAA 1932 

TAATTTGTAA ATGATTTTAA TTTTATATCA GTGAAGAGAA TTTATTTATA AAATTAATCA 1992 

TTTAATAAAG AAATATTTAC CTAAAAAAAA AAAAAA 2028 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 379 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr Leu Leu 
1 5 10 15 

His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala Cys His Cys 
20 25 30 

Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly Leu Val Arg Asp 
35 40 45 

Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys 
50 55 60 

Ser Lys Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe 
65 70 75 80 

Gly Ala Ser Ser Thr Ala Leu Lys Gly Ile Cys Arg Ala Gln Ser Glu 
85 90 95 

Gly Arg Pro Cys Glu Tyr Asn Ser Arg Ile Tyr Gln Asn Gly Glu Ser 
100 105 110 

Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val 
115 120 125 

Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly 
130 135 140 

Cys Pro Asn Pro Arg Leu Val Lys Val Ser Gly Gln Cys Cys Glu Glu 
145 150 155 160 

Trp Val Cys Asp Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp Gln Asp 
165 170 175 

Asp Leu Leu Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg Asn Asn 
180 185 190 

Glu Leu Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu Pro Val 
195 200 205 

Phe Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala His Gly 
210 215 220 

Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Ser 
225 230 235 240 

Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Glu Cys 
245 250 255 

Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln 
260 265 270 

Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys Ser Lys Thr Lys 
275 280 285 

Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val 
290 295 300 

Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys 
305 310 315 320 

Cys Thr Pro Leu Gln Thr Arg Thr Val Lys Met Arg Phe Arg Cys Glu 
325 330 335 

Asp Gly Glu Met Phe Ser Lys Asn Val Met Met Ile Gln Ser Cys Lys 
340 345 350 

Cys Asn Tyr Asn Cys Pro His Pro Asn Glu Ala Ser Phe Arg Leu Tyr 
355 360 365 

Ser Leu Phe Asn Asp Ile His Lys Phe Arg Asp 
370 375 
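
The three-letter residue notation used in the protein listings can be converted to one-letter codes and the cysteine residues counted with a short script. The following is a minimal sketch in Python, not part of the original disclosure; the function names and the `FRAGMENT` variable are hypothetical, and only residues 1 to 32 of SEQ ID NO: 2 are used as input.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter codes.

    Raises KeyError on any token that is not a standard residue code,
    which also flags transcription errors in the input.
    """
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split())


def count_cysteines(three_letter_seq: str) -> int:
    """Count Cys residues in a three-letter sequence."""
    return to_one_letter(three_letter_seq).count("C")


# Residues 1-32 of SEQ ID NO: 2 (hypothetical variable name, illustration only).
FRAGMENT = (
    "Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr Leu Leu "
    "His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala Cys His Cys"
)

if __name__ == "__main__":
    print(to_one_letter(FRAGMENT))
    print(count_cysteines(FRAGMENT))
```

Applied to a complete listing, the same count can be compared with the figure of about 38 cysteine residues recited in the claims.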

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2330 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: N 

(iv) ANTI-SENSE: N 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 

(G) CELL TYPE: Fibroblast 

(H) CELL LINE: AKR2B 

(viii) POSITION IN GENOME: 
(C) UNITS: bp 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 204..1247 
(D) OTHER INFORMATION: 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 204..1247 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AGACTCAGCC AGATCCACTC CAGCTCCGAC CCCAGGAGAC CGACCTCCTC CAGACGGCAG 60 

CAGCCCCAGC CCAGCCGACA ACCCCAGACG CCACCGCCTG GAGCGTCCAG ACACCAACCT 120 

CCGCCCCTGT CCGAATCCAG GCTCCAGCCG CGCCTCTCGT CGCCTCTGCA CCCTGCTGTG 180 



CATCCTCCTA CCGCGTCCCG ATC ATG CTC GCC TCC GTC GCA GGT CCC ATC 230 
Met Leu Ala Ser Val Ala Gly Pro Ile 
1 5 

AGC CTC GCC TTG GTG CTC CTC GCC CTC TGC ACC CGG CCT GCT ACG GGC 278 
Ser Leu Ala Leu Val Leu Leu Ala Leu Cys Thr Arg Pro Ala Thr Gly 
10 15 20 25 

CAG GAC TGC AGC GCG CAA TGT CAG TGC GCA GCC GAA GCA GCG CCG CAC 326 
Gln Asp Cys Ser Ala Gln Cys Gln Cys Ala Ala Glu Ala Ala Pro His 
30 35 40 

TGC CCC GCC GGC GTG AGC CTG GTG CTG GAC GGC TGC GGC TGC TGC CGC 374 
Cys Pro Ala Gly Val Ser Leu Val Leu Asp Gly Cys Gly Cys Cys Arg 
45 50 55 

GTC TGC GCC AAG CAG CTG GGA GAA CTG TGT ACG GAG CGT GAC CCC TGC 422 
Val Cys Ala Lys Gln Leu Gly Glu Leu Cys Thr Glu Arg Asp Pro Cys 
60 65 70 

GAC CCA CAC AAG GGC CTC TTC TGC GAT TTC GGC TCC CCC GCC AAC CGC 470 
Asp Pro His Lys Gly Leu Phe Cys Asp Phe Gly Ser Pro Ala Asn Arg 
75 80 85 

AAG ATT GGA GTG TGC ACT GCC AAA GAT GGT GCA CCC TGT GTC TTC GGT 518 
Lys Ile Gly Val Cys Thr Ala Lys Asp Gly Ala Pro Cys Val Phe Gly 
90 95 100 105 

GGG TCG GTG TAC CGC AGC GGT GAG TCC TTC CAA AGC AGC TGC AAA TAC 566 
Gly Ser Val Tyr Arg Ser Gly Glu Ser Phe Gln Ser Ser Cys Lys Tyr 
110 115 120 

CAA TGC ACT TGC CTG GAT GGG GCC GTG GGC TGC GTG CCC CTA TGC AGC 614 
Gln Cys Thr Cys Leu Asp Gly Ala Val Gly Cys Val Pro Leu Cys Ser 
125 130 135 

ATG GAC GTG CGC CTG CCC AGC CCT GAC TGC CCC TTC CCG AGA AGG GTC 662 
Met Asp Val Arg Leu Pro Ser Pro Asp Cys Pro Phe Pro Arg Arg Val 
140 145 150 

AAG CTG CCT GGG AAA TGC TGC GAG GAG TGG GTG TGT GAC GAG CCC AAG 710 
Lys Leu Pro Gly Lys Cys Cys Glu Glu Trp Val Cys Asp Glu Pro Lys 
155 160 165 

GAC CGC ACA GCA GTT GGC CCT GCC CTA GCT GCC TAC CGA CTG GAA GAC 758 
Asp Arg Thr Ala Val Gly Pro Ala Leu Ala Ala Tyr Arg Leu Glu Asp 
170 175 180 185 



ACA TTT GGC CCA GAC CCA ACT ATG ATG CGA GCC AAC TGC CTG GTC CAG 806 
Thr Phe Gly Pro Asp Pro Thr Met Met Arg Ala Asn Cys Leu Val Gln 
190 195 200 

ACC ACA GAG TGG AGC GCC TGT TCT AAG ACC TGT GGA ATG GGC ATC TCC 854 
Thr Thr Glu Trp Ser Ala Cys Ser Lys Thr Cys Gly Met Gly Ile Ser 
205 210 215 



ACC CGA GTT ACC AAT GAC AAT ACC TTC TGC AGA CTG GAG AAG CAG AGC 902 
Thr Arg Val Thr Asn Asp Asn Thr Phe Cys Arg Leu Glu Lys Gln Ser 
220 225 230 

CGC CTC TGC ATG GTC AGG CCC TGC GAA GCT GAC CTG GAG GAA AAC ATT 950 
Arg Leu Cys Met Val Arg Pro Cys Glu Ala Asp Leu Glu Glu Asn Ile 
235 240 245 

AAG AAG GGC AAA AAG TGC ATC CGG ACA CCT AAA ATC GCC AAG CCT GTC 998 
Lys Lys Gly Lys Lys Cys Ile Arg Thr Pro Lys Ile Ala Lys Pro Val 
250 255 260 265 

AAG TTT GAG CTT TCT GGC TGC ACC AGT GTG AAG ACA TAC AGG GCT AAG 1046 
Lys Phe Glu Leu Ser Gly Cys Thr Ser Val Lys Thr Tyr Arg Ala Lys 
270 275 280 

TTC TGC GGG GTG TGC ACA GAC GGC CGC TGC TGC ACA CCG CAC AGA ACC 1094 
Phe Cys Gly Val Cys Thr Asp Gly Arg Cys Cys Thr Pro His Arg Thr 
285 290 295 

ACC ACT CTG CCA GTG GAG TTC AAA TGC CCC GAT GGC GAG ATC ATG AAA 1142 
Thr Thr Leu Pro Val Glu Phe Lys Cys Pro Asp Gly Glu Ile Met Lys 
300 305 310 

AAG AAT ATG ATG TTC ATC AAG ACC TGT GCC TGC CAT TAC AAC TGT CCT 1190 
Lys Asn Met Met Phe Ile Lys Thr Cys Ala Cys His Tyr Asn Cys Pro 
315 320 325 

GGG GAC AAT GAC ATC TTT GAG TCC CTG TAC TAC AGG AAG ATG TAC GGA 1238 
Gly Asp Asn Asp Ile Phe Glu Ser Leu Tyr Tyr Arg Lys Met Tyr Gly 
330 335 340 345 

GAC ATG GCG TAAAGCCAGG AAGTAAGGGA CACGAACTCA TTAGACTATA 1287 
Asp Met Ala 

ACTTGAACTG AGTTGCATCT CATTTTCTTC TGTAAAAACA ATTACAGTAG CACATTAATT 1347 

TAAATCTGTG TTTTTAACTA CCGTGGGAGG AACTATCCCA CCAAAGTGAG AACGTTATGT 1407 

CATGGCCATA CAAGTAGTCT GTCAACCTCA GACACTGGTT TCGAGACAGT TTACACTTGA 1467 

CAGTTGTTCA TTAGCGCACA GTGCCAGAAC GCACACTGAG GTGAGTCTCC TGGAACAGTG 1527 

GAGATGCCAG GAGAAAGAAA GACAGGTACT AGCTGAGGTT ATTTTAAAAG CAGCAGTGTG 1587 

CCTACTTTTT GGAGTGTAAC CGGGGAGGGA AATTATAGCA TGCTTGCAGA CAGACCTGCT 1647 

CTAGCGAGAG CTGAGCATGT GTCCTCCACT AGATGAGGCT GAGTCCAGCT GTTCTTTAAG 1707 

AACAGCAGTT TCAGCTCTGA CCATTCTGAT TCCAGTGACA CTTGTCAGGA GTCAGAGCCT 1767 

TGTCTGTTAG ACTGGACAGC TTGTGGCAAG TAAGTTTGCC TGTAACAAGC CAGATTTTTA 1827 

TTGATATTGT AAATATTGTG GATATATATA TATATATATA TATATTTGTA CAGTTATCTA 1887 

AGTTAATTTA AAGTCATTTG TTTTTGTTTT AAGTGCTTTT GGGATTTTAA ACTGATAGCC 1947 

TCAAACTCCA AACACCATAG GTAGGACACG AAGCTTATCT GTGATTCAAA ACAAAGGAGA 2007 

TACTGCAGTG GGAATTGTGA CCTGAGTGAC TCTCTGTCAG AACAAACAAA TGCTGTGCAG 2067 

GTGATAAAGC TATGTATTGG AAGTCAGATT TCTAGTAGGA AATGTGGTCA AATCCCTGTT 2127 

GGTGAACAAA TGGCCTTTAT TAAGAAATGG CTGGCTCAGG GTAAGGTCCG ATTCCTACCA 2187 

GGAAGTGCTT GCTGCTTCTT TGATTATGAC TGGTTTGGGG TGGGGGGCAG TTTATTTGTT 2247 

GAGAGTGTGA CCAAAAGTTA CATGTTTGCA CTTTCTAGTT GAAAATAAAG TATATATATA 2307 

TTTTTATATG AAAAAAAAAA AAA 2330 

(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Leu Ala Ser Val Ala Gly Pro Ile Ser Leu Ala Leu Val Leu Leu 
1 5 10 15 

Ala Leu Cys Thr Arg Pro Ala Thr Gly Gln Asp Cys Ser Ala Gln Cys 
20 25 30 

Gln Cys Ala Ala Glu Ala Ala Pro His Cys Pro Ala Gly Val Ser Leu 
35 40 45 

Val Leu Asp Gly Cys Gly Cys Cys Arg Val Cys Ala Lys Gln Leu Gly 
50 55 60 

Glu Leu Cys Thr Glu Arg Asp Pro Cys Asp Pro His Lys Gly Leu Phe 
65 70 75 80 

Cys Asp Phe Gly Ser Pro Ala Asn Arg Lys Ile Gly Val Cys Thr Ala 
85 90 95 

Lys Asp Gly Ala Pro Cys Val Phe Gly Gly Ser Val Tyr Arg Ser Gly 
100 105 110 

Glu Ser Phe Gln Ser Ser Cys Lys Tyr Gln Cys Thr Cys Leu Asp Gly 
115 120 125 

Ala Val Gly Cys Val Pro Leu Cys Ser Met Asp Val Arg Leu Pro Ser 
130 135 140 

Pro Asp Cys Pro Phe Pro Arg Arg Val Lys Leu Pro Gly Lys Cys Cys 
145 150 155 160 

Glu Glu Trp Val Cys Asp Glu Pro Lys Asp Arg Thr Ala Val Gly Pro 
165 170 175 

Ala Leu Ala Ala Tyr Arg Leu Glu Asp Thr Phe Gly Pro Asp Pro Thr 
180 185 190 

Met Met Arg Ala Asn Cys Leu Val Gln Thr Thr Glu Trp Ser Ala Cys 
195 200 205 

Ser Lys Thr Cys Gly Met Gly Ile Ser Thr Arg Val Thr Asn Asp Asn 
210 215 220 

Thr Phe Cys Arg Leu Glu Lys Gln Ser Arg Leu Cys Met Val Arg Pro 
225 230 235 240 

Cys Glu Ala Asp Leu Glu Glu Asn Ile Lys Lys Gly Lys Lys Cys Ile 
245 250 255 

Arg Thr Pro Lys Ile Ala Lys Pro Val Lys Phe Glu Leu Ser Gly Cys 
260 265 270 

Thr Ser Val Lys Thr Tyr Arg Ala Lys Phe Cys Gly Val Cys Thr Asp 
275 280 285 

Gly Arg Cys Cys Thr Pro His Arg Thr Thr Thr Leu Pro Val Glu Phe 
290 295 300 

Lys Cys Pro Asp Gly Glu Ile Met Lys Lys Asn Met Met Phe Ile Lys 
305 310 315 320 

Thr Cys Ala Cys His Tyr Asn Cys Pro Gly Asp Asn Asp Ile Phe Glu 
325 330 335 

Ser Leu Tyr Tyr Arg Lys Met Tyr Gly Asp Met Ala 
340 345 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1804 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: N 

(iv) ANTI-SENSE: N 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Gallus domesticus 
(G) CELL TYPE: Fibroblast 
(H) CELL LINE: CEF10 

(viii) POSITION IN GENOME: 
(C) UNITS: bp 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 53..1177 
(D) OTHER INFORMATION: 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 119..1177 
(D) OTHER INFORMATION: 

(ix) FEATURE: 

(A) NAME/KEY: sig_peptide 

(B) LOCATION: 53..118 
(D) OTHER INFORMATION: 
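
The feature locations above can be checked against the stated protein lengths by simple arithmetic: an inclusive location `start..end` covers `end - start + 1` nucleotides, and a coding feature should contain a whole number of codons. The following is a minimal sketch in Python, not part of the original disclosure; the helper names and the `SEQ5_FEATURES` dictionary are hypothetical.

```python
def feature_length(location: str) -> int:
    """Length in nucleotides of an inclusive 'start..end' feature location."""
    # Tolerate stray spaces such as '53.. 1177'.
    start_text, end_text = location.replace(" ", "").split("..")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"end precedes start in {location!r}")
    return end - start + 1


def codon_count(location: str) -> int:
    """Number of complete codons spanned by a coding feature."""
    length = feature_length(location)
    if length % 3 != 0:
        raise ValueError(f"{location!r} is not a whole number of codons")
    return length // 3


# Feature locations listed for SEQ ID NO: 5 (dictionary name is illustrative).
SEQ5_FEATURES = {
    "CDS": "53..1177",
    "sig_peptide": "53..118",
    "mat_peptide": "119..1177",
}

if __name__ == "__main__":
    for name, location in SEQ5_FEATURES.items():
        print(name, feature_length(location), codon_count(location))
```

For SEQ ID NO: 5 this gives 375 codons for the CDS, 22 for the signal peptide and 353 for the mature peptide, which agrees with the 375 amino acids stated for SEQ ID NO: 6 and with its numbering from -22. The same check on the CDS location 204..1247 of SEQ ID NO: 3 gives 348 codons, matching the length stated for SEQ ID NO: 4.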

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Simmons, Daniel L 
Levy, Daniel B 
Yannoni, Yvonne 
Erikson, R L 

(B) TITLE: Identification of a phorbol ester-repressible 
v-src-inducible gene 

(C) JOURNAL: Proc. Natl. Acad. Sci. U.S.A. 

(D) VOLUME: 86 

(F) PAGES: 1178-1182 

(G) DATE: February-1989 

(K) RELEVANT RESIDUES IN SEQ ID NO: 5: FROM 1 TO 1804 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



CCCCCTTCGC GATCGCGTCT CGAGCTCCGC TCTCGCTCCG CGCCGCTAAG AC ATG 55 

Met 
-22 

GGC TCT GCG GGA GCr CGC CCC GCG CTG GCG GCC GCC CTG CTC TGC CTG 103 

Gly Ser Ala Gly Ala Arg Pro Ala Leu Ala Ala Ala Leu Leu Cys Leu 
-20 -15 -10 

GCC CGC CTG GCT CTC GGC TCT CCG TGC CCC GCC GTC TGC CAG TGC CCG 151 
Ala Arg Leu Ala Leu Gly Ser Pro Cys Pro Ala Val Cys Gln Cys Pro 
-5 1 5 10 

GCC GCC GCG CCG CAG TGC GCC CCG GGC GTG GGG UG GTG CCG GAC GGC 199 
Ala Ala Ala Pro Gln Cys Ala Pro Gly Val Gly Leu Val Pro Asp Gly 
15 20 25 



TGC GGC TGC TGC AAG GTC TGC GCC AAG CAG CTG AAC GAG GAC TGC AGC 247 
Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys Ser 
30 35 40 

CGG ACG CAG CCC TGC GAC CAC ACC AAG GGG CTG GAG TGC AAC TTC GGC 295 
Arg Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe Gly 
45 50 55 

GCC AGC CCC GCC GCC ACC AAC GGC ATC TGC AGA GCA CAG TCT GAG GGG 343 
Ala Ser Pro Ala Ala Thr Asn Gly Ile Cys Arg Ala Gln Ser Glu Gly 
60 65 70 75 

AGA CCA TGC GAA TAC AAC TCC AAA ATC TAC CAG AAC GGC GAA AGC TTC 391 
Arg Pro Cys Glu Tyr Asn Ser Lys Ile Tyr Gln Asn Gly Glu Ser Phe 
80 85 90 

CAG CCC AAC TGC AAG CAC CAG TGT ACG TGC ATA GAT GGA GCT GTG GGC 439 
Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val Gly 
95 100 105 

TGC ATC CCG CTC TGC CCG CAG GAG CTC TCC CTC CCC AAC CTG GGC TGC 487 
Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly Cys 
110 115 120 

CCC AGC CCC AGG CTG GTC AAA GTG CCT GGG CAG TGC TGC GAG GAG TGG 535 
Pro Ser Pro Arg Leu Val Lys Val Pro Gly Gln Cys Cys Glu Glu Trp 
125 130 135 

GTC TGC GAT GAG AGC AAG GAT GCG CTG GAG GAG CTG GAG GGC TTC TTC 583 
Val Cys Asp Glu Ser Lys Asp Ala Leu Glu Glu Leu Glu Gly Phe Phe 
140 145 150 155 

AGC AAG GAG TTT GGT CTG GAC GCT TCT GAG GGC GAA CTG ACC CGG AAC 631 
Ser Lys Glu Phe Gly Leu Asp Ala Ser Glu Gly Glu Leu Thr Arg Asn 
160 165 170 

AAC GAG CTG ATT GCC ATC GTG AAG GGA GGC CTG AAG ATG CTA CCT GTT 679 
Asn Glu Leu Ile Ala Ile Val Lys Gly Gly Leu Lys Met Leu Pro Val 
175 180 185 

TTT GGA TCC GAG CCG CAA AGC CGA GCT TTT GAG AAT CCC AAA TGC ATT 727 
Phe Gly Ser Glu Pro Gln Ser Arg Ala Phe Glu Asn Pro Lys Cys Ile 
190 195 200 

GTG CAA ACA ACT TCC TGG TCC CAG TGC TCA AAG ACG TGT GGG ACC GGC 775 
Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Thr Cys Gly Thr Gly 
205 210 215 

ATC TCC ACC AGA GTC ACC AAC GAC AAT CCC GAC TGC AAG CTC ATC AAA 823 
Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Asp Cys Lys Leu Ile Lys 
220 225 230 235 

GAG ACC AGG ATA TGC GAA GTG AGG CCG TGT GGC CAG CCC AGC TAC GCC 871 
Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln Pro Ser Tyr Ala 
240 245 250 

TCC CTG AAG AAG GGA AAA AAA TGT ACC AAG ACT AAG AAG TCC CCA TCC 919 
Ser Leu Lys Lys Gly Lys Lys Cys Thr Lys Thr Lys Lys Ser Pro Ser 
255 260 265 

CCT GTA AGG TTT ACT TAT GCT GGA TGC TCC AGT GTG AAG AAG TAC CGC 967 
Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val Lys Lys Tyr Arg 
270 275 280 

CCC AAG TAC TGT GGG TCT TGC GTG GAT GGC AGG TGC TGr ACT CCC CAG 1015 
Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys Cys Thr Pro Gln 
285 290 295 

CAG ACC AGG ACT GTC AAG ATC CGT TTC CGC TGC GAT GAT GGA GAA ACC 1063 
Gln Thr Arg Thr Val Lys Ile Arg Phe Arg Cys Asp Asp Gly Glu Thr 
300 305 310 315 

TTC ACC AAG ACT GTC ATG ATG ATC CAG TCC TGC CGC TGC AAC TAC AAC 1111 
Phe Thr Lys Ser Val Met Met Ile Gln Ser Cys Arg Cys Asn Tyr Asn 
320 325 330 

TGT CCG CAT GCA AAC GAA GCT TAT CCC TTC TAC AGA CTG GTC AAT GAC 1159 
Cys Pro His Ala Asn Glu Ala Tyr Pro Phe Tyr Arg Leu Val Asn Asp 
335 340 345 

ATC CAC AAA TTT AGG GAC TAAGTGGTAT TTGGGGTGGG ATGTTAAACA 1207 
Ile His Lys Phe Arg Asp 
350 



GAATTCTGAA GTAACCAGCC ATGGAGAAAG GACCTCTGAT GGAAGTGGTG CCTTGCCCCA 1267 

TTTGAGGGCA ATATGAGATA TTACAGGAGT GCACTGTGCA ACTGGACACT AATGCGACAG 1327 

AGATTTAAGC ATACTTAAAG CTTCATAGTA CTGCAGCAAC CTTACTGCTT CTTTTTGGAG 1387 

CACCTTTATC TTACACTGTT TTCTGTTTGT AAGTGATCTG ATGTTTTGTT CCGGTTATGA 1447 

AAGCTCTTCC TCTCCCGTTC AGTTTAACAC TACGCTTTTC CCCTCCCCTC CATCTTCTCC 1507 

CCTACTCTCC CAACCAAGTT GGAAGTTACA TTCCTTCCTG AGGTGGGCAC TTGTGGGGTG 1567 

TTCACAGTGG CAGCTATTAT GTACCAACTG TAGTTTAATG GCAAACAGAA ATCAGTTGTT 1627 

TTAAAGCTGA GTATTTTATT TATCAAACTG TAGCTCTTTT GTTTTCTTTT TTTTTTTTTT 1687 

TAACCCCTTC CAACCCCTGT AATACTGGAA TAAGTTGTAA ATGATTTTAA TTTTATATTC 1747 

GATGAATTAA AAGAATTTAT TTATGGAATT AATCATTTAA TAAAGAAATA TTTACCT 1804 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Gly Ser Ala Gly Ala Arg Pro Ala Leu Ala Ala Ala Leu Leu Cys 
-22 -20 -15 -10 

Leu Ala Arg Leu Ala Leu Gly Ser Pro Cys Pro Ala Val Cys Gln Cys 
-5 1 5 10 

Pro Ala Ala Ala Pro Gln Cys Ala Pro Gly Val Gly Leu Val Pro Asp 
15 20 25 

Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys 
30 35 40 

Ser Arg Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe 
45 50 55 

Gly Ala Ser Pro Ala Ala Thr Asn Gly Ile Cys Arg Ala Gln Ser Glu 
60 65 70 

Gly Arg Pro Cys Glu Tyr Asn Ser Lys Ile Tyr Gln Asn Gly Glu Ser 
75 80 85 90 

Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val 
95 100 105 

Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly 
110 115 120 

Cys Pro Ser Pro Arg Leu Val Lys Val Pro Gly Gln Cys Cys Glu Glu 
125 130 135 

Trp Val Cys Asp Glu Ser Lys Asp Ala Leu Glu Glu Leu Glu Gly Phe 
140 145 150 

Phe Ser Lys Glu Phe Gly Leu Asp Ala Ser Glu Gly Glu Leu Thr Arg 
155 160 165 170 

Asn Asn Glu Leu Ile Ala Ile Val Lys Gly Gly Leu Lys Met Leu Pro 
175 180 185 

Val Phe Gly Ser Glu Pro Gln Ser Arg Ala Phe Glu Asn Pro Lys Cys 
190 195 200 

Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Thr Cys Gly Thr 
205 210 215 

Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Asp Cys Lys Leu Ile 
220 225 230 

Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln Pro Ser Tyr 
235 240 245 250 

Ala Ser Leu Lys Lys Gly Lys Lys Cys Thr Lys Thr Lys Lys Ser Pro 
255 260 265 



Ser Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val Lys Lys Tyr 
270 275 280 

Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys Cys Thr Pro 
285 290 295 

Gln Gln Thr Arg Thr Val Lys Ile Arg Phe Arg Cys Asp Asp Gly Glu 
300 305 310 

Thr Phe Thr Lys Ser Val Met Met Ile Gln Ser Cys Arg Cys Asn Tyr 
315 320 325 330 

Asn Cys Pro His Ala Asn Glu Ala Tyr Pro Phe Tyr Arg Leu Val Asn 
335 340 345 

Asp Ile His Lys Phe Arg Asp 
350 



Claims 

1. A substantially purified protein comprising about 345 to about 380 amino acid residues, having a molecular 
weight of about 37,000 daltons to about 45,000 daltons and containing about 38 cysteine residues, said 
protein being induced by TGF-beta administration to mammalian cells. 

2. The protein according to Claim 1, wherein the protein has an amino acid residue sequence substantially 
corresponding to the sequence depicted in FIGURE 1 designated as βIG-M1 and having Sequence I.D. 
No. 2. 

3. The protein according to Claim 1, wherein the protein has an amino acid residue sequence substantially 
corresponding to the sequence depicted in FIGURE 2 designated as βIG-M2 and having Sequence I.D. 
No. 4. 

4. The protein according to Claim 2 encoded by a nucleotide sequence substantially corresponding to the 
sequence of FIGURE 1 and having Sequence I.D. No. 1. 

5. The protein according to Claim 3 encoded by a nucleotide sequence substantially corresponding to the 
sequence of FIGURE 2 and having Sequence I.D. No. 3. 

6. A nucleotide sequence encoding a TGF-beta induced protein substantially corresponding to the nucleotide 
sequence depicted in FIGURE 1 and having Sequence I.D. No. 1. 

7. A nucleotide sequence encoding a TGF-beta induced protein substantially corresponding to the 
nucleotide sequence depicted in FIGURE 2 and having Sequence I.D. No. 3. 

8. A gene family induced by TGF-beta wherein the induced genes encode a protein comprising about 345 
to about 380 amino acid residues, having a molecular weight of about 37,000 daltons to about 45,000 
daltons and containing about 38 cysteine residues. 

9. The gene family according to Claim 8 wherein an induced gene encodes a protein having an amino acid 
residue sequence substantially corresponding to the sequence depicted in FIGURE 1 and having Sequence 
I.D. No. 2. 

10. The gene family according to Claim 8 wherein an induced gene encodes a protein having an amino acid 
residue sequence substantially corresponding to the sequence depicted in FIGURE 2 and having Sequence 
I.D. No. 4. 

11. The gene family according to Claim 8 wherein an induced gene has a nucleotide sequence substantially 
corresponding to the sequence depicted in FIGURE 1 and having Sequence I.D. No. 1. 

12. The gene family according to Claim 8 wherein an induced gene has a nucleotide sequence substantially 
corresponding to the sequence depicted in FIGURE 2 and having Sequence I.D. No. 3. 

13. A method for the determination of a TGF-β induced gene comprising the steps of: 

(1) treating a mammalian cell with an effective amount of an inhibitor of mRNA translation for a time 
period sufficient to inhibit protein synthesis; 

(2) further treating said mammalian cell with an effective amount of TGF-β for a time period sufficient 
to induce mRNA synthesis from TGF-β inducible genes; 

(3) preparing a cDNA library from mRNA isolated from the cell treated according to steps (1) and (2); 

(4) probing the cDNA library with cDNA isolated from the untreated mammalian cell of step (1); 

(5) probing the cDNA library with cDNA isolated from the mammalian cell treated according to steps 
(1) and (2); 

(6) selecting a cDNA detected in step (4) but not in step (5); and 

(7) sequencing the DNA selected in step (6). 

14. A method for the production of a protein according to any one of claims 1 to 5 comprising the steps of: 

(1) inserting a nucleic acid coding sequence encoding the protein into an expression vector; 

(2) transforming or transfecting a mammalian cell with the expression vector; 

(3) culturing the mammalian cell to express the protein; and 

(4) isolating the protein. 

βIG-M1 CONSENSUS 112790 

GACCGTGAGC GAGAGGCCCA GAGAAGCGCC TGCAATCTCT GCGCTCCTCC GCCAGCACCT 60 

CGAGAGAAGG ACACCCGCCG CCTCGGCCCT CGCCTCACCG CACTCCGGGC GCATTTGATC 120 

CCGCTGCTCG CCGGCTTGTT GGTTCTGTGT CGCCGCGCTC GCCCCGGTTC CTCCTGCGCG 180 

CCACA ATG AGC TCC AGC ACC TTC AGG ACG CTC GCT GTC GCC GTC ACC 227 
Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr 
1 5 10 

CTT CTC CAC TTG ACC AGA CTG GCG CTC TCC ACC TGC CCC GCC GCC 272 
Leu Leu His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala 
15 20 25 

TGC CAC TGC CCT CTG GAG GCA CCC AAG TGC GCC CCG GGA GTC GGG 317 
Cys His Cys Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly 
30 35 40 

TTG GTC CGG GAC GGC TGC GGC TGC TGT AAG GTC TGC GCT AAA CAA 362 
Leu Val Arg Asp Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln 
45 50 55 

CTC AAC GAG GAC TGC AGC AAA ACT CAG CCC TGC GAC CAC ACC AAG 407 
Leu Asn Glu Asp Cys Ser Lys Thr Gln Pro Cys Asp His Thr Lys 
60 65 70 

GGG TTG GAA TGC AAT TTC GGC GCC AGC TCC ACC GCT CTG AAA GGG 452 
Gly Leu Glu Cys Asn Phe Gly Ala Ser Ser Thr Ala Leu Lys Gly 
75 80 85 

ATC TGC AGA GCT CAG TCA GAA GGC AGA CCC TGT GAA TAT AAC TCC 497 
Ile Cys Arg Ala Gln Ser Glu Gly Arg Pro Cys Glu Tyr Asn Ser 
90 95 100 

AGA ATC TAC CAA AAC GGG GAA AGC TTC CAG CCC AAC TGT AAA CAC 542 
Arg Ile Tyr Gln Asn Gly Glu Ser Phe Gln Pro Asn Cys Lys His 
105 110 115 

CAG TGC ACA TGT ATT GAT GGC GCC GTG GGC TGC ATT CCT CTG TGT 587 
Gln Cys Thr Cys Ile Asp Gly Ala Val Gly Cys Ile Pro Leu Cys 
120 125 130 



FIGURE 1 

52 SI T ? I CT CTC CCC ** T CT6 TGT CCC ** CCC CGG 632 
Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly Cys Pro Asn Pro Arg 
135 140 145 

Let vlf f** f ° GGG CAG T£C TGT «* WG TGG GTT TGT GAT 677 
Leu Val Lys Val Ser Gly Gln Cys Cys Glu Glu Trp Val Cys Asp 
150 155 160 

GAA GAC AGC ATT AAG GAC TCC CTG GAC GAC CAG GAT GAC CTC CTC 722 
Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp Gln Asp Asp Leu Leu 
165 170 175 

GGA CTC GAT GCC TCG GAG GTG GAG TTA ACG AGA AAC AAT GAG TTA 767 
Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg Asn Asn Glu Leu 
180 185 190 

ATC GCA ATT GGA AAA GGC AGC TCA CTG AAG AGG CTT CCT GTC TTT 812 
Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu Pro Val Phe 
195 200 205 

GGC ACC GAA CCG CGA GTT CTT TTC AAC CCT CTG CAC GCC CAT GGC 857 
Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala His Gly 
210 215 220 

CAG AAA TGC ATC GTT CAG ACC ACG TCT TGG TCC CAG TGC TCC AAG 902 
Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys 
225 230 235 

AGC TGC GGA ACT GGC ATC TCC ACA CGA GTT ACC AAT GAC AAC CCA 947 
Ser Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro 
240 245 250 

GAG TGC CGC CTG GTG AAA GAG ACC CGG ATC TGT GAA GTG CGT CCT 992 
Glu Cys Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro 
255 260 265 

TGT GGA CAA CCA GTG TAC AGC AGC CTA AAA AAG GGC AAG AAA TGC 1037 
Cys Gly Gln Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys 
270 275 280 

AGC AAG ACC AAG AAA TCC CCA GAA CCA GTC AGA TTT ACT TAT GCA 1082 
Ser Lys Thr Lys Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala 
285 290 295 

FIGURE 1 (Cont.) 

GGA TGC TCC AGT GTC AAG AAA TAC CGG CCC AAA TAC TGC GGC TCC 1127 
Gly Cys Ser Ser Val Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser 
300 305 310 

TGC GTA GAT GGC CGG TGC TGC ACA CCT CTG CAG ACC AGA ACT GTG 1172 
Cys Val Asp Gly Arg Cys Cys Thr Pro Leu Gln Thr Arg Thr Val 
315 320 325 

AAG ATG CGG TTC CGA TGC GAA GAT GGA GAG ATG TTT TCC AAG AAT 1217 
Lys Met Arg Phe Arg Cys Glu Asp Gly Glu Met Phe Ser Lys Asn 
330 335 340 

GTC ATG ATG ATC CAG TCC TGC AAA TGT AAC TAC AAC TGC CCG CAT 1262 
Val Met Met Ile Gln Ser Cys Lys Cys Asn Tyr Asn Cys Pro His 
345 350 355 

CCC AAC GAG GCA TCG TTC CGA CTG TAC AGC CTA TTC AAT GAC ATC 1307 
Pro Asn Glu Ala Ser Phe Arg Leu Tyr Ser Leu Phe Asn Asp Ile 
360 365 370 

CAC AAG TTC AGG GAC TAAGTGCCTC CAGGGTTCCT AGTGTGGGCT GGACAGAGGA 1362 

His Lys Phe Arg Asp 

375 

GAAGCGCAAG CATCATGGAG ACGTGGGTGG GCGGAGGATG AATGGTGCCT TGCTCATTCT 1422 
TGAGTAGCAT TAGGGTATTT CAAAACTGCC AAGGGGCTGA TGTGGACGGA CAGCAGCGCA 1482 
GCCGCAGTTG GAGAATGCCA AGGGGCTGAT GTGGACGGAC AGCAGCGCAG CCGCAGTTGG 1542 
AGAAGACTTC GCTTCATAGT ACTGGAGCGG GCATTATTGC TCCATATTGG AGCATGTTTA 1602 
CGGATGACGT TCTGTTTTCT GTTTGTAAAT TATTTGCTAA GTGTATTTTT TTGCTCCAGA 1662 
CCCCCCCCCC CCCTTTCTTG GTTCTACAAT TGTAATAGAG ACAAAATAAG ATTAGTTGGG 1722 
CCAAGTGAAA GCCCTGCTTG TCCTTTGACA GAAGTAAATG AAAGCGCCTC TCATTCCTTC 1782 
CCGAGCGGAG GGGGGACACT CTGTGAGTGT CCTTGGGGCA GCTACCTGCA CTCTAAAACT 1842 
GCAAACAGAA ACCAGGTGTT TTAAGATTGA ATGTTTTTTT ATTTATCAAA GTGTAGCTTT 1902 
TGGGGAGGGA GGGGAAATGT AATACTGGAA TAATTTGTAA ATGATTTTAA TTTTATATCA 1962 
GTGAAGAGAA TTTATTTATA AAATTAATCA TTTAATAAAG AAATATTTAC CTAAAAAAAA 2022 
AAAAAA 2028 

FIGURE 1 (Cont.) 

βIG-M2 CONSENSUS 112790 

AGACTCAGCC AGATCCACTC CAGCTCCGAC CCCAGGAGAC CGACCTCCTC CAGACGGCAG 60 

CAGCCCCAGC CCAGCCGACA ACCCCAGACG CCACCGCCTG GAGCGTCCAG ACACCAACCT 120 

CCGCCCCTGT CCGAATCCAG GCTCCAGCCG CGCCTCTCGT CGCCTCTGCA CCCTGCTGTG 180 

CATCCTCCTA CCGCGTCCCG ATC ATG CTC GCC TCC GTC GCA GGT CCC 227 

Met Leu Ala Ser Val Ala Gly Pro 
1 5 

ATC AGC CTC GCC TTG GTG CTC CTC GCC CTC TGC ACC CGG CCT GCT 272 
Ile Ser Leu Ala Leu Val Leu Leu Ala Leu Cys Thr Arg Pro Ala 
10 15 20 

ACG GGC CAG GAC TGC AGC GCG CAA TGT CAG TGC GCA GCC GAA GCA 317 
Thr Gly Gln Asp Cys Ser Ala Gln Cys Gln Cys Ala Ala Glu Ala 
25 30 35 

GCG CCG CAC TGC CCC GCC GGC GTG AGC CTG GTG CTG GAC GGC TGC 362 
Ala Pro His Cys Pro Ala Gly Val Ser Leu Val Leu Asp Gly Cys 
40 45 50 

GGC TGC TGC CGC GTC TGC GCC AAG CAG CTG GGA GAA CTG TGT ACG 407 
Gly Cys Cys Arg Val Cys Ala Lys Gln Leu Gly Glu Leu Cys Thr 
55 60 65 

GAG CGT GAC CCC TGC GAC CCA CAC AAG GGC CTC TTC TGC GAT TTC 452 
Glu Arg Asp Pro Cys Asp Pro His Lys Gly Leu Phe Cys Asp Phe 
70 75 80 

GGC TCC CCC GCC AAC CGC AAG ATT GGA GTG TGC ACT GCC AAA GAT 497 
Gly Ser Pro Ala Asn Arg Lys Ile Gly Val Cys Thr Ala Lys Asp 
85 90 95 

GGT GCA CCC TGT GTC TTC GGT GGG TCG GTG TAC CGC AGC GGT GAG 542 
Gly Ala Pro Cys Val Phe Gly Gly Ser Val Tyr Arg Ser Gly Glu 
100 105 110 

TCC TTC CAA AGC AGC TGC AAA TAC CAA TGC ACT TGC CTG GAT GGG 587 
Ser Phe Gln Ser Ser Cys Lys Tyr Gln Cys Thr Cys Leu Asp Gly 
115 120 125 

GCC GTG GGC TGC GTG CCC CTA TGC AGC ATG GAC GTG CGC CTG CCC 632 
Ala Val Gly Cys Val Pro Leu Cys Ser Met Asp Val Arg Leu Pro 
130 135 140 

FIGURE 2 

AGC CCT GAC TGC CCC TTC CCG AGA AGG GTC AAG CTG CCT GGG AAA 677 
Ser Pro Asp Cys Pro Phe Pro Arg Arg Val Lys Leu Pro Gly Lys 
145 150 155 

TGC TGC GAG GAG TGG GTG TGT GAC GAG CCC AAG GAC CGC ACA GCA 722 
Cys Cys Glu Glu Trp Val Cys Asp Glu Pro Lys Asp Arg Thr Ala 
160 165 170 

GTT GGC CCT GCC CTA GCT GCC TAC CGA CTG GAA GAC ACA TTT GGC 767 
Val Gly Pro Ala Leu Ala Ala Tyr Arg Leu Glu Asp Thr Phe Gly 
175 180 185 

CCA GAC CCA ACT ATG ATG CGA GCC AAC TGC CTG GTC CAG ACC ACA 812 
Pro Asp Pro Thr Met Met Arg Ala Asn Cys Leu Val Gln Thr Thr 
190 195 200 

GAG TGG AGC GCC TGT TCT AAG ACC TGT GGA ATG GGC ATC TCC ACC 857 
Glu Trp Ser Ala Cys Ser Lys Thr Cys Gly Met Gly Ile Ser Thr 
205 210 215 

CGA GTT ACC AAT GAC AAT ACC TTC TGC AGA CTG GAG AAG CAG AGC 902 
Arg Val Thr Asn Asp Asn Thr Phe Cys Arg Leu Glu Lys Gln Ser 
220 225 230 

CGC CTC TGC ATG GTC AGG CCC TGC GAA GCT GAC CTG GAG GAA AAC 947 
Arg Leu Cys Met Val Arg Pro Cys Glu Ala Asp Leu Glu Glu Asn 
235 240 245 

ATT AAG AAG GGC AAA AAG TGC ATC CGG ACA CCT AAA ATC GCC AAG 992 
Ile Lys Lys Gly Lys Lys Cys Ile Arg Thr Pro Lys Ile Ala Lys 
250 255 260 

CCT GTC AAG TTT GAG CTT TCT GGC TGC ACC AGT GTG AAG ACA TAC 1037 
Pro Val Lys Phe Glu Leu Ser Gly Cys Thr Ser Val Lys Thr Tyr 
265 270 275 

AGG GCT AAG TTC TGC GGG GTG TGC ACA GAC GGC CGC TGC TGC ACA 1082 
Arg Ala Lys Phe Cys Gly Val Cys Thr Asp Gly Arg Cys Cys Thr 
280 285 290 

CCG CAC AGA ACC ACC ACT CTG CCA GTG GAG TTC AAA TGC CCC GAT 1127 
Pro His Arg Thr Thr Thr Leu Pro Val Glu Phe Lys Cys Pro Asp 
295 300 305 

GGC GAG ATC ATG AAA AAG AAT ATG ATG TTC ATC AAG ACC TGT GCC 1172 
Gly Glu Ile Met Lys Lys Asn Met Met Phe Ile Lys Thr Cys Ala 
310 315 320 

FIGURE 2 (Cont.) 

TGC CAT TAC AAC TGT CCT GGG GAC AAT GAC ATC TTT GAG TCC CTG 1217 
Cys His Tyr Asn Cys Pro Gly Asp Asn Asp Ile Phe Glu Ser Leu 
325 330 335 

TAC TAC AGG AAG ATG TAC GGA GAC ATG GCG TAAAGCCAGG AAGTAAGGGA 1267 
Tyr Tyr Arg Lys Met Tyr Gly Asp Met Ala 
340 345 

CACGAACTCA TTAGACTATA ACTTGAACTG AGTTGCATCT CATTTTCTTC TGTAAAAACA 1327 

ATTACAGTAG CACATTAATT TAAATCTGTG TTTTTAACTA CCGTGGGAGG AACTATCCCA 1387 

CCAAAGTGAG AACGTTATGT CATGGCCATA CAAGTAGTCT GTCAACCTCA GACACTGGTT 1447 

TCGAGACAGT TTACACTTGA CAGTTGTTCA TTAGCGCACA GTGCCAGAAC GCACACTGAG 1507 

GTGAGTCTCC TGGAACAGTG GAGATGCCAG GAGAAAGAAA GACAGGTACT AGCTGAGGTT 1567 

ATTTTAAAAG CAGCAGTGTG CCTACTTTTT GGAGTGTAAC CGGGGAGGGA AATTATAGCA 1627 

TGCTTGCAGA CAGACCTGCT CTAGCGAGAG CTGAGCATGT GTCCTCCACT AGATGAGGCT 1687 

GAGTCCAGCT GTTCTTTAAG AACAGCAGTT TCAGCTCTGA CCATTCTGAT TCCAGTGACA 1747 

CTTGTCAGGA GTCAGAGCCT TGTCTGTTAG ACTGGACAGC TTGTGGCAAG TAAGTTTGCC 1807 

TGTAACAAGC CAGATTTTTA TTGATATTGT AAATATTGTG GATATATATA TATATATATA 1867 

TATATTTGTA CAGTTATCTA AGTTAATTTA AAGTCATTTG TTTTTGTTTT AAGTGCTTTT 1927 

GGGATTTTAA ACTGATAGCC TCAAACTCCA AACACCATAG GTAGGACACG AAGCTTATCT 1987 

GTGATTCAAA ACAAAGGAGA TACTGCAGTG GGAATTGTGA CGTGAGTGAC TCTCTGTCAG 2047 

AACAAACAAA TGCTGTGCAG GTGATAAAGC TATGTATTGG AAGTCAGATT TCTAGTAGGA 2107 

AATGTGGTCA AATCCCTGTT GGTGAACAAA TGGCCTTTAT TAAGAAATGG CTGGCTCAGG 2167 

GTAAGGTCCG ATTCCTACCA GGAAGTGCTT GCTGCTTCTT TGATTATGAC TGGTTTGGGG 2227 

TGGGGGGCAG TTTATTTGTT GAGAGTGTGA CCAAAAGTTA CATGTTTGCA CTTTCTAGTT 2287 

GAAAATAAAG TATATATATA TTTTTATATG AAAAAAAAAA AAA 2330 
CEF10 - MGSAGARP-ALAAALLCLARLALGSPCPAVCQCPAAAPQCAPGVGLVPDG -49 

βIG-M1 - MSSSTFRTLAVAVTLLHLTRLAL-STCPAACHCPLEAPKCAPGVGLVRDG -49 

CEF10 - CGCCKVCAKQLNEDCSRTQPCDHTKGLECNFGASPAATNGICRAQSEGRP -99 

βIG-M1 - CGCCKVCAKQLNEDCSKTQPCDHTKGLECNFGASSTALKGICRAQSEGRP -99 

CEF10 - CEYNSKIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPSPR -149 

βIG-M1 - CEYNSRIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPNPR -149 

CEF10 - LVKVPGQCCEEWVCDES--KDALEELEGFFSKEFGLDASEGELTRNNEII -197 

βIG-M1 - LVKVSGQCCEEWVCDEDSIKDSLDDQDDLL GLDASEVELTRNNELI -195 

CEF10 - AIVKGG-IKMIPVFGSEPQSRAFENP KCIVQTTSWSQCSKTCGT -240 

βIG-M1 - AIGKGSSLKRLPVFGTEP--RVLFNPLHAHGQKCIVQTTSWSQCSKSCGT -243 
βIG-M1 - GISTRVTNDNPECRLVKETRICEVRPCGQPVYSSLKKGKKCSKTKKSPEP -293 

CEF10 - VRFTYAGCSSVKKYRPKYCGSCVDGRCCTPQQTRTVKIRFRCDDGETFTK -340 

βIG-M1 - VRFTYAGCSSVKKYRPKYCGSCVDGRCCTPIQTRTVKMRFRCEDGEMFSK -343 

CEF10 - SVMMIQSCRCNYNCPHANEA-YPFYRIVNDIHKFRD -375 

βIG-M1 - NVMMIQSCKCNYNCPHPNEASFRLYSIFNDIHKFRD -379 
βIG-M1 - MSSSTFRTLAVAVTLLHL-TRLALST-CPAACHCPLEA-PKCAPGVGLVR -47 

βIG-M2 - MLASVAGPISLALVLLALCTRPATGQDCSAQCQCAAEAAPHCPAGVSLVL -50 

βIG-M1 - DGCGCCKVCAKQLNEDCSKTQPCDHTKGLECNFGASSTALKGICRAQSEG -97 

βIG-M2 - DGCGCCRVCAKQIGEICTERDPCDPHKGLFCDFGSPANRKIGVCTAK-DG -99 

βIG-M1 - RPCEYNSRIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPN -147 

βIG-M2 - APCVFGGSVYRSGESFQSSCKYQCTCLDGAVGCVPICSHDVRIPSPDCPF -149 

βIG-M1 - PRLVKVSGQCCEEWVCDEDSIKDSLDDQDDLLGLDASEVELTRNNELIAI -197 
βIG-M2 - PRRVKLPGKCCEEWVCDEPKDRTAVG PALAAYRLEDT -186 

βIG-M1 - GKGSSLKRLPVFGTEPRVLFNPLHAHGQKCIVQTTSWSQCSKSCGTGIST -247 
βIG-M2 - FGPDP TMHRAN--CLVQTTEWSACSKTCGHGIST -218 

βIG-M1 - RVTNDNPECRLVKETRICEVRPCGQPVYSSLKKGKKCSKTKKSPEPVRFT -297 

βIG-M2 - RVTNDNTFCRLEKQSRLCMVRPCEADLEENIKKGKKCIRTPKIAKPVKFE -268 
βIG-M1 CIVQTTSWSQCSKSCGTGISTRVT NDNPECRL-VKETRICEVR 42 

CEF12CS CIVQTTSWSQCSKTCGTGISTRVT NDNPDCKL-IKETRICEVR 42 

βIG-M2 CLVQTTEWSACSKTCGHGISTRVT NDNTFCRL-EKQSRLCMVR 42 

PFALCIPACS NSI-STEWSPCSVTCGNGIQVRIKPGSANKPKDELDYEN-DIEKKICKME 48 

PROPEROCSR WSX-WSPWSPCSVTCSXGXQXXXRXRXCXXPAPXX-GXPCAGXAXXXXXQ 48 

THROMBOCS WSH-WSPWSSCSVTCGDGV--ITRIRLCNSPSPQHNGKPC--ECEARETK 45 

PFALTRAPCS CGV-WDEWSPCSVTCGKGTRSRKREILHEG CTSEIQEQ 37 

C7COMPCS WOF-YAPWSECN-GCTKTQTRRRSVAVYG QYGGQPCVG--NAFETQ 42 